Description

FIELD OF TECHNOLOGY 

The present invention relates to a novel DNA and a process for preparing a protein which possesses an activity to
inhibit osteoclast differentiation and/or maturation (hereinafter called osteoclastogenesis-inhibitory activity) by a genetic
engineering technique using the DNA. More particularly, the present invention relates to a genomic DNA encoding a
protein OCIF which possesses an osteoclastogenesis-inhibitory activity and a process for preparing said protein by a
genetic engineering technique using the genomic DNA.

BACKGROUND OF THE INVENTION 

Human bones constantly undergo a cycle of resorption and formation. Osteoblasts, which control bone formation,
and osteoclasts, which control bone resorption, play major roles in this process. Osteoporosis is a typical disease
caused by abnormal bone metabolism. This disease arises when bone resorption by osteoclasts exceeds bone
formation by osteoblasts. Although the mechanism of this disease has yet to be elucidated completely, the disease
causes the bones to ache, makes the bones fragile, and may result in fractures. As the aged population
increases, this disease increases the number of bedridden aged people, which has become a social problem. Urgent
development of a therapeutic agent for this disease is strongly desired. Diseases due to a decrease in bone mass are
expected to be treated by controlling bone resorption, accelerating bone formation, or improving the balance between bone
resorption and formation.

Osteogenesis is expected to increase by accelerating proliferation, differentiation, or activation of the cells controlling
bone formation, or by controlling proliferation, differentiation, or activation of the cells involved in bone resorption.
In recent years, strong interest has been directed to physiologically active proteins (cytokines) exhibiting such activities
as described above, and energetic research is ongoing on this subject. The cytokines which have been reported to
accelerate proliferation or differentiation of osteoblasts include the proteins of the fibroblast growth factor family (FGF:
Rodan S. B. et al., Endocrinology vol. 121, p 1917, 1987), insulin-like growth factor I (IGF-I: Hock J. M. et al.,
Endocrinology vol. 122, p 254, 1988), insulin-like growth factor II (IGF-II: McCarthy T. et al., Endocrinology vol. 124, p 301, 1989),
Activin A (Centrella M. et al., Mol. Cell. Biol., vol. 11, p 250, 1991), transforming growth factor-β (Noda M., The Bone,
vol. 2, p 29, 1988), Vasculotropin (Varonique M. et al., Biochem. Biophys. Res. Commun., vol. 199, p 380, 1994), and
the proteins of the heterotopic bone formation factor family (bone morphogenic protein; BMP: BMP-2; Yamaguchi A. et al., J.
Cell Biol. vol. 113, p 682, 1991, OP-1; Sampath T. K. et al., J. Biol. Chem. vol. 267, p 20532, 1992, and Knutsen R. et
al., Biochem. Biophys. Res. Commun. vol. 194, p 1352, 1993).

On the other hand, the cytokines reported to suppress differentiation and/or maturation of osteoclasts include transforming
growth factor-β (Chenu C. et al., Proc. Natl. Acad. Sci. USA, vol. 85, p 5683, 1988), interleukin-4 (Kasano K. et al.,
Bone-Miner., vol. 21, p 179, 1993), and the like. Further, the cytokines reported to suppress bone
resorption by osteoclasts include calcitonin (Bone-Miner., vol. 17, p 347, 1992), macrophage colony stimulating factor
(Hattersley G. et al., J. Cell. Physiol., vol. 137, p 199, 1988), interleukin-4 (Watanabe, K. et al., Biochem. Biophys. Res.
Commun. vol. 172, p 1035, 1990), and interferon-γ (Gowen M. et al., J. Bone Miner. Res., vol. 1, p 469, 1986).

These cytokines are expected to be used as agents for treating diseases accompanied by bone loss by accelerating
bone formation or suppressing bone resorption. Clinical tests are being undertaken to verify the effect of some
cytokines, such as insulin-like growth factor-I and the heterotopic bone formation factor family, in improving bone metabolism.
In addition, calcitonin is already commercially available as a therapeutic agent for osteoporosis and a pain relief agent.
At present, drugs for clinically treating bone diseases or shortening the period of treatment of bone diseases include
activated vitamin D3, calcitonin and its derivatives, hormone preparations such as estradiol agents, ipriflavone, and
calcium preparations. These agents are not necessarily satisfactory in terms of efficacy and therapeutic results.
Development of a novel therapeutic agent which can be used in place of these agents is strongly desired.

In view of this situation, the present inventors have undertaken extensive studies. As a result, the present inventors
found protein OCIF exhibiting an osteoclastogenesis-inhibitory activity in a culture broth of human embryonic lung
fibroblast IMR-90 (ATCC Deposition No. CCL186), and filed a patent application (PCT/JP96/00374). The present
inventors have conducted further studies relating to the origin of this protein OCIF exhibiting the
osteoclastogenesis-inhibitory activity. The studies have matured into determination of the sequence of a genomic DNA encoding the human
origin OCIF. Accordingly, an object of the present invention is to provide a genomic DNA encoding protein OCIF
exhibiting osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique
using the genomic DNA.

DISCLOSURE OF THE INVENTION

Specifically, the present invention relates to a genomic DNA encoding protein OCIF exhibiting
osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique using the genomic DNA.
The DNA of the present invention includes the nucleotide sequences No. 1 and No. 2 in the Sequence Table attached
hereto.

Moreover, the present invention relates to a process for preparing a protein, comprising inserting a DNA including
the nucleotide sequences of the sequences No. 1 and No. 2 in the Sequence Table into an expression vector, producing
a vector capable of expressing a protein having the following physicochemical characteristics and exhibiting the activity
of inhibiting differentiation and/or maturation of osteoclasts, and producing this protein by a genetic engineering
technique.

(a) molecular weight (SDS-PAGE):

(i) Under reducing conditions: about 60 kD,

(ii) Under non-reducing conditions: about 60 kD and about 120 kD;

(b) amino acid sequence:

includes an amino acid sequence of the Sequence ID No. 3 of the Sequence Table,

(c) affinity:

exhibits affinity to a cation exchanger and heparin, and

(d) thermal stability:

(i) the osteoclast differentiation and/or maturation inhibitory activity is reduced when treated with heat at 70°C
for 10 minutes or at 56°C for 30 minutes,

(ii) the osteoclast differentiation and/or maturation inhibitory activity is lost when treated with heat at 90°C for
10 minutes.

The protein obtained by expressing the gene of the present invention exhibits an osteoclastogenesis-inhibitory
activity. This protein is effective as an agent for the treatment and improvement of diseases involving a decrease in the
amount of bone such as osteoporosis, diseases relating to bone metabolism abnormality such as rheumatism,
degenerative joint disease, or multiple myeloma, and is useful as an antigen to establish an immunological diagnosis of such
diseases.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows a result of Western blotting analysis of the protein obtained by causing genomic DNA of the present
invention to express a protein in Example 4 (iii), wherein lane 1 indicates a marker, lane 2 indicates the culture broth of
COS7 cells in which a vector pWESRαOCIF (Example 4 (iii)) has been transfected, and lane 3 indicates the culture broth of
COS7 cells in which a vector pWESRα (control) has been transfected.

BEST MODE FOR CARRYING OUT THE INVENTION 

The genomic DNA encoding the protein OCIF which exhibits osteoclastogenesis-inhibitory activity in the present
invention can be obtained by preparing a cosmid library using a human placenta genomic DNA and a cosmid vector
and by screening this library using DNA fragments which are prepared based on the OCIF cDNA as a probe. The
thus-obtained genomic DNA is inserted into a suitable expression vector to prepare an OCIF expression cosmid. A
recombinant type OCIF can be obtained by transfecting the genomic DNA into a host organism such as various types of cells
or microorganism strains and causing the DNA to express a protein by a conventional method. The resultant protein
exhibiting osteoclastogenesis-inhibitory activity (an osteoclastogenesis-inhibitory factor) is useful as an agent for the
treatment and improvement of diseases involving a decrease in bone mass such as osteoporosis and other diseases
relating to bone metabolism abnormality and also as an antigen to prepare antibodies for establishing immunological
diagnosis of such diseases. The protein of the present invention can be prepared as a drug composition for oral or
non-oral administration. Specifically, the drug composition of the present invention containing the protein which is an
osteoclastogenesis-inhibitory factor as an active ingredient can be safely administered to humans and animals. As the form
of drug composition, a composition for injection, composition for intravenous drip, suppository, nasal agent, sublingual
agent, percutaneous absorption agent, and the like are given. In the case of the composition for injection, such a
composition is a mixture of a pharmacologically effective amount of osteoclastogenesis-inhibitory factor of the present
invention and a pharmaceutically acceptable carrier. The composition may further comprise amino acids, saccharides,
cellulose derivatives, and other excipients and/or activation agents, including other organic compounds and inorganic
compounds which are commonly added to a composition for injection. When an injection preparation is prepared using
the osteoclastogenesis-inhibitory factor of the present invention and these excipients and activation agents, a pH
adjuster, buffering agent, stabilizer, solubilizing agent, and the like may be added if necessary to prepare various types
of injection agents.

The present invention will now be described in more detail by way of examples which are given for the purpose of
illustration and are not intended to be limiting of the present invention.

Example 1 

(Preparation of a cosmid library)

A cosmid library was prepared using human placenta genomic DNA (Clontech; Cat. No. 6550-2) and pWE15
cosmid vector (Stratagene). The experiment was carried out principally following the protocol attached to the pWE15
cosmid vector kit of Stratagene Company, provided that Molecular Cloning: A Laboratory Manual (Cold Spring Harbor
Laboratory (1989)) was referred to for common procedures for handling DNA, E. coli, and phage.

(i) Preparation of restriction enzymolysate of human genomic DNA

Human placenta genomic DNA dissolved in 750 µl of a solution containing 10 mM Tris-HCl, 10 mM MgCl2, and 100
mM NaCl was added to four 1.5 ml Eppendorf tubes (tubes A, B, C, and D) in the amount of 100 µg each. Restriction
enzyme MboI was added to these tubes in the amounts of 0.5 unit for tube A, 0.4 unit for tube B, 0.6 unit for tube C, and
O^ unit for tube D, and DNA was digested for 1 hour. Then, EDTA in the amount to make a 20 mM concentration was
added to each tube to terminate the reaction, followed by extraction with phenol/chloroform (1:1). A two-fold amount of
ethanol was added to the aqueous layer to precipitate DNA. DNA was collected by centrifugation and washed with 70%
ethanol, and DNA in each tube was dissolved in 100 µl of TE (10 mM Tris-HCl (pH 8.0) + 1 mM EDTA buffer solution, hereinafter
called TE). DNA in the four tubes was combined in one tube and incubated for 10 minutes at 68°C. After cooling to room
temperature, the mixture was overlaid onto a 10%-40% linear sucrose gradient which was prepared in a buffer
containing 20 mM Tris-HCl (pH 8.0), 5 mM EDTA, and 1 mM NaCl in a centrifugal tube (38 ml). The tube was centrifuged
at 26,000 rpm for 24 hours at 20°C using a rotor SRP28SA manufactured by Hitachi, Ltd., and 0.4 ml fractions of the
sucrose gradient were collected using a fraction collector. A portion of each fraction was subjected to 0.4% agarose
electrophoresis to confirm the size of DNA. Fractions containing DNA with a length of 30 kb (kilo base pair) to 40 kb were
thus combined. The DNA solution was diluted with TE to make a sucrose concentration of 10% or less, and 2.5-fold
volumes of ethanol was added to precipitate DNA. DNA was dissolved in TE and stored at 4°C.
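The final dilution and precipitation step is simple arithmetic, sketched below in Python. This is only an illustration: the function name, the pooled volume, and the 25% sucrose figure are hypothetical inputs, not values from this description. Only the 10% sucrose ceiling and the 2.5-fold ethanol volume come from the text.

```python
def dilution_and_ethanol(pool_ul, sucrose_pct, target_pct=10.0, ethanol_fold=2.5):
    """Return (TE to add, ethanol to add) in microlitres for pooled fractions.

    pool_ul and sucrose_pct are hypothetical measured inputs; target_pct and
    ethanol_fold follow the "10% or less" and "2.5-fold volumes" in the text.
    """
    if pool_ul <= 0 or sucrose_pct < 0 or target_pct <= 0:
        raise ValueError("volumes and concentrations must be positive")
    # Dilute only when the pool is above the target sucrose concentration.
    factor = max(1.0, sucrose_pct / target_pct)
    diluted_ul = pool_ul * factor
    te_ul = diluted_ul - pool_ul
    # Ethanol is added relative to the volume after dilution.
    ethanol_ul = ethanol_fold * diluted_ul
    return te_ul, ethanol_ul


# Hypothetical pool: 1200 ul of fractions at roughly 25% sucrose.
te, etoh = dilution_and_ethanol(pool_ul=1200.0, sucrose_pct=25.0)
print(f"add {te:.0f} ul TE, then {etoh:.0f} ul ethanol")
```

In practice the sucrose concentration of a pooled fraction would be estimated from its position in the gradient, so the result is a planning figure rather than an exact volume.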

(ii) Preparation of cosmid vector

The pWE15 cosmid vector obtained from Stratagene Company was completely digested with restriction enzyme
BamHI according to the protocol attached to the cosmid vector kit. DNA collected by ethanol precipitation was dissolved
in TE to a concentration of 1 mg/ml. Phosphoric acid at the 5'-end of this DNA was removed using calf small intestine
alkaline phosphatase, and DNA was collected by phenol extraction and ethanol precipitation. The DNA was dissolved
in TE to a concentration of 1 mg/ml.

(iii) Ligation of genomic DNA to vector and in vitro packaging 

1.5 µg of genomic DNA fractionated according to size and 3 µg of pWE15 cosmid vector which was
digested with restriction enzyme BamHI were ligated in 20 µl of a reaction solution using Ready-To-Go T4 DNA ligase
of Pharmacia Company. The ligated DNA was packaged in vitro using Gigapack™ II packaging extract (Stratagene)
according to the protocol. After the packaging reaction, a portion of the reaction mixture was diluted stepwise with an
SM buffer solution and mixed with E. coli XL1-Blue MR (Stratagene) which was suspended in 10 mM MgCl2 to cause
phage to infect, and plated onto LB agar plates containing 50 µg/ml of ampicillin. The number of colonies produced was
counted. The number of colonies per 1 µl of packaging reaction was calculated based on this result.
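The titer calculation described above can be sketched as follows. This is a minimal illustration in Python with hypothetical numbers (120 colonies, a 100-fold dilution, 10 µl plated); the description does not report the actual counts, and the function names are not from the source.

```python
def colonies_per_microlitre(colony_count, dilution_factor, plated_volume_ul):
    """Colonies per microlitre of undiluted packaging reaction.

    colony_count: colonies counted on one plate
    dilution_factor: total fold dilution of the packaging reaction (e.g. 100)
    plated_volume_ul: volume of the diluted reaction used for infection
    """
    if plated_volume_ul <= 0 or dilution_factor <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colony_count * dilution_factor / plated_volume_ul


def volume_for_target(target_colonies, titer_per_ul):
    """Microlitres of undiluted packaging reaction needed per plate."""
    if titer_per_ul <= 0:
        raise ValueError("titer must be positive")
    return target_colonies / titer_per_ul


# Hypothetical plate count, for illustration only.
titer = colonies_per_microlitre(colony_count=120, dilution_factor=100, plated_volume_ul=10)
print(f"{titer:.0f} colonies per ul; "
      f"{volume_for_target(50_000, titer):.1f} ul per 50,000-colony plate")
```

The second function connects the titer to the plating density of 50,000 colonies per plate used when the library is prepared.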

(iv) Preparation of a cosmid library

The packaging reaction solution thus prepared was mixed with E. coli XL1-Blue MR and the mixture was plated
onto agarose plates containing ampicillin so as to produce 50,000 colonies per agarose plate having a diameter of 15 cm.
After incubating the plates overnight at 37°C, an LB culture medium was added in the amount of 3 ml per plate to
suspend and collect colonies of E. coli. Each agarose plate was again washed with 3 ml of the LB culture medium and
the washing was combined with the original suspension of E. coli. The E. coli collected from all agarose plates was
placed in a centrifugal tube, glycerol was added to a concentration of 20%, and ampicillin was further added to make a
final concentration of 50 µg/ml. A portion of the E. coli suspension was removed and the remainder was stored at
-80°C. The removed E. coli was diluted stepwise and plated onto agar plates to count the number of colonies per 1
ml of suspension.
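For orientation, the number of clones a library of this kind needs can be estimated with the standard Clarke-Carbon relation, N = ln(1 - P) / ln(1 - I/G). The Python sketch below is not part of the described procedure. It assumes a haploid human genome of about 3 x 10^9 bp, a 35 kb average insert (the midpoint of the 30 to 40 kb fraction), and a 99% probability target; these three inputs are assumptions, and only the 50,000 colonies per plate comes from the text.

```python
import math


def clones_needed(insert_bp, genome_bp, probability):
    """Clarke-Carbon estimate of clones needed to include a given sequence."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0.0 < insert_bp < genome_bp:
        raise ValueError("insert must be shorter than the genome")
    fraction = insert_bp / genome_bp
    # log1p keeps precision when the fraction is very small.
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


def plates_needed(clones, colonies_per_plate=50_000):
    """Number of 15 cm plates at the plating density used in the text."""
    return math.ceil(clones / colonies_per_plate)


GENOME_BP = 3.0e9   # assumed haploid human genome size
INSERT_BP = 35_000  # assumed average insert, midpoint of 30-40 kb

n = clones_needed(INSERT_BP, GENOME_BP, probability=0.99)
print(n, "clones,", plates_needed(n), "plates")
```

Under these assumptions the estimate is on the order of 4 x 10^5 clones, that is, several plates at 50,000 colonies each.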

Example 2 

(Screening of cosmid library and purification of colony) 

A nitrocellulose filter (Millipore) with a diameter of 14.2 cm was placed on each LB agarose plate with a diameter
of 15 cm which contained 50 µg/ml of ampicillin. The cosmid library was plated onto the plates so as to produce 50,000
colonies of E. coli per plate, followed by incubation overnight at 37°C. E. coli on the nitrocellulose filter was transferred
to another nitrocellulose filter according to a conventional method to obtain two replica filters. According to the protocol
attached to the cosmid vector kit, cosmid DNA in the E. coli on the replica filters was denatured with an alkali,
neutralized, and immobilized on the nitrocellulose filter using a Stratalinker (Stratagene). The filters were heated for two hours
at 80°C in a vacuum oven. The nitrocellulose filters thus obtained were hybridized using two kinds of DNA produced,
respectively, from the 5'-end and 3'-end of human OCIF cDNA as probes. Namely, a plasmid was purified from E. coli
pKB/OIF10 (deposited at The Ministry of International Trade and Industry, the Agency of Industrial Science and
Technology, Biotechnology Laboratory, Deposition No. FERM BP-5267) containing OCIF cDNA. The plasmid containing
OCIF cDNA was digested with restriction enzymes KpnI and EcoRI. Fragments thus obtained were separated using
agarose gel electrophoresis. A KpnI/EcoRI fragment with a length of 0.2 kb was purified using a QIAEX II gel extraction
kit (Qiagen). This DNA was labeled with 32P using the Megaprime DNA Labeling System (Amersham) (5'-DNA probe).
Apart from this, a BamHI/EcoRV fragment with a length of 0.2 kb which was produced from the above plasmid by
digestion with restriction enzymes BamHI and EcoRV was purified and labeled with 32P (3'-DNA probe). One of the replica
filters described above was hybridized with the 5'-DNA probe and the other with the 3'-DNA probe. Hybridization and
washing of the filters were carried out according to the protocol attached to the cosmid vector kit. Autoradiography
detected several positive signals with each probe. One colony which gave positive signals with both probes was
identified. The colony on the agar plate which corresponded to the signal on the autoradiogram was isolated and purified.
A cosmid was prepared from the purified colony by a conventional method. This cosmid was named pWEOCIF. The
size of human genomic DNA contained in this cosmid was about 38 kb.
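The selection of a colony positive with both probes amounts to finding signals that coincide on the two replica autoradiograms. The Python sketch below illustrates that idea only. The coordinates, the 1 mm tolerance, and the function name are hypothetical, and the description does not say how the films were actually compared.

```python
from math import hypot


def double_positives(signals_5p, signals_3p, tol_mm=1.0):
    """Return midpoints of signals seen on both replica filters.

    signals_5p, signals_3p: lists of (x, y) positions in mm, measured from a
    common orientation mark on each autoradiogram (hypothetical input format).
    tol_mm: maximum distance for two signals to count as the same colony.
    """
    hits = []
    for x5, y5 in signals_5p:
        for x3, y3 in signals_3p:
            if hypot(x5 - x3, y5 - y3) <= tol_mm:
                hits.append(((x5 + x3) / 2, (y5 + y3) / 2))
                break  # one match per 5'-probe signal is enough
    return hits


# Hypothetical signal positions, for illustration only.
five_prime = [(10.0, 10.0), (40.0, 55.0), (70.0, 20.0)]
three_prime = [(40.4, 54.7), (90.0, 90.0)]
print(double_positives(five_prime, three_prime))
```

Requiring both the 5'-end and the 3'-end probe to hybridize favors clones that span the whole cDNA rather than only one end of the gene.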

Example 3

(Determination of the nucleotide sequence of human OCIF genomic DNA)

(i) Subcloning of OCIF genomic DNA

Cosmid pWEOCIF was digested with restriction enzyme EcoRI. After the separation of the DNA fragments thus
produced by electrophoresis using a 0.7% agarose gel, the DNA fragments were transferred to a nylon membrane
(Hybond-N, Amersham) by the Southern blot technique and immobilized on the nylon membrane using Stratalinker
(Stratagene). On the other hand, plasmid pBKOCIF was digested with restriction enzyme EcoRI and a 1.6 kb fragment
containing human OCIF cDNA was isolated by agarose gel electrophoresis. The fragment was labeled with 32P using
the Megaprime DNA labeling system (Amersham).

Hybridization of the nylon membrane described above with the 32P-labeled 1.6-kb OCIF cDNA, performed
according to a conventional method, detected DNA fragments with sizes of 6 kb, 4 kb, 3.6 kb, and 2.6 kb. These
fragments which hybridized with the human OCIF cDNA were isolated using agarose gel electrophoresis and individually
subcloned into an EcoRI site of pBluescript II SK+ vector (Stratagene) by a conventional method. The resulting plasmids
were respectively named pBSE 6, pBSE 4, pBSE 3.6, and pBSE 2.6.
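As a quick consistency check on the sizes reported above, the four hybridizing EcoRI fragments can be summed and compared with the cosmid insert of about 38 kb. The Python sketch below uses only figures stated in this description; the variable names are illustrative.

```python
# EcoRI fragments that hybridized with the OCIF cDNA probe, in kb (from the text).
fragments_kb = {"pBSE 6": 6.0, "pBSE 4": 4.0, "pBSE 3.6": 3.6, "pBSE 2.6": 2.6}
insert_kb = 38.0  # approximate size of the genomic insert in pWEOCIF

total_kb = sum(fragments_kb.values())
share = total_kb / insert_kb

print(f"hybridizing fragments: {total_kb:.1f} kb "
      f"({share:.0%} of the ~{insert_kb:.0f} kb insert)")
```

The hybridizing fragments therefore account for less than half of the insert, which is what one would expect if the remainder consists of flanking sequence or EcoRI fragments without homology to the cDNA.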

(ii) Determination of the nucleotide sequence

The nucleotide sequence of human OCIF genomic DNA which was subcloned into the plasmid was determined
using the ABI Dideoxy Terminator Cycle Sequencing Ready Reaction kit (Perkin Elmer) and the 373 Sequencing
System (Applied Biosystems). The primer used for the determination of the nucleotide sequence was synthesized based
on the nucleotide sequence of human OCIF cDNA (Sequence ID No. 4 in the Sequence Table). The nucleotide
ple which was diluted with α-MEM culture medium containing 1 x 10^-8 M activated vitamin D3 and 10% fetal bovine
serum was added. After 7 days from the start of culturing, the cells were washed with a phosphate buffered saline and
fixed with an ethanol/acetone (1:1) solution for one minute at room temperature. The osteoclast formation was detected
by staining the cells using an acidic phosphatase activity measurement kit (Acid Phosphatase, Leucocyte, Cat. No.
387-A, Sigma Company). A decrease in the number of cells positive for acidic phosphatase activity in the presence of tartaric
acid was taken as the OCIF activity. The results are shown in Table 1, which indicates that the conditioned medium
exhibits activity similar to that of natural type OCIF obtained from the IMR-90 culture medium and recombinant OCIF
produced by CHO cells.

TABLE 1

Activity of OCIF expressed by COS-7 cells in the conditioned medium

Dilution                       1/10   1/20   1/40   1/80   1/160   1/320
OCIF genomic DNA introduced    ++     ++     ++     ++     +
Vector introduced
Untreated

"++" indicates an activity inhibiting 80% or more of osteoclast formation, "+" indicates an activity inhibiting 30-80%
of osteoclast formation, and "-" indicates that no inhibition of osteoclast formation is observed.
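The grading used in Table 1 can be written as a small function. The Python sketch below is an illustration only. It assumes the intermediate and negative grades are written "+" and "-", it computes inhibition from hypothetical counts of acid-phosphatase-positive cells, and it maps anything below 30% to "-", although the legend only defines "-" as no observed inhibition.

```python
def inhibition_percent(control_count, treated_count):
    """Percent reduction in positive cells relative to an untreated control."""
    if control_count <= 0:
        raise ValueError("control count must be positive")
    return 100.0 * (control_count - treated_count) / control_count


def grade(inhibition_pct):
    """Map percent inhibition to the grades used in the table legend."""
    if inhibition_pct >= 80.0:
        return "++"   # 80% or more of osteoclast formation inhibited
    if inhibition_pct >= 30.0:
        return "+"    # 30-80% inhibited
    return "-"        # assumption: below 30% is treated as no inhibition


# Hypothetical cell counts, for illustration only.
for treated in (20, 100, 200):
    pct = inhibition_percent(control_count=200, treated_count=treated)
    print(f"{treated:>3} positive cells -> {pct:.0f}% inhibition -> {grade(pct)}")
```

How the boundary at exactly 80% was handled experimentally is not stated; the sketch assigns it to "++" to match "80% or more".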



(iii) Identification of the product by Western Blotting

A buffer solution (10 µl) for SDS-PAGE (0.5 M Tris-HCl, 20% glycerol, 4% SDS, 20 µg/ml bromophenol blue, pH
6.8) was added to 10 µl of the sample for the measurement of OCIF activity prepared in (ii) above. After boiling for 3
minutes at 100°C, the mixture was subjected to 10% SDS polyacrylamide electrophoresis under non-reducing
conditions. The proteins were transferred from the gel to a PVDF membrane (ProBlott, Perkin Elmer) using a semi-dry blotting
apparatus (Biorad). The membrane was blocked and incubated for 2 hours at 37°C together with a horseradish
peroxidase-labeled anti-OCIF antibody obtained by labeling the previously obtained OCIF protein with horseradish
peroxidase according to a conventional method. After washing, the protein which had bound the anti-OCIF antibody was
detected using the ECL system (Amersham). As shown in Figure 1, two bands, one with a molecular weight of about
120 kilo dalton and the other about 60 kilo dalton, were detected in the supernatant obtained from the culture broth of
COS-7 cells in which pWESRαOCIF was transfected. On the other hand, these two bands with molecular weights of about
120 kilo dalton and 60 kilo dalton were not detected in the supernatant obtained from the culture broth of COS-7 cells
in which the pWESRα vector was transfected, confirming that the protein obtained was OCIF.

INDUSTRIAL APPLICABILITY 

The present invention provides a genomic DNA encoding a protein OCIF which possesses an
osteoclastogenesis-inhibitory activity and a process for preparing this protein by a genetic engineering technique using the genomic DNA.
The protein obtained by expressing the gene of the present invention exhibits an osteoclastogenesis-inhibitory activity
and is useful as an agent for the treatment and improvement of diseases involving a decrease in the amount of bone
such as osteoporosis, other diseases resulting from bone metabolism abnormality such as rheumatism or degenerative
joint disease, and multiple myeloma. The protein is further useful as an antigen to establish antibodies useful for an
immunological diagnosis of such diseases.

NOTE ON MICROORGANISM 

Depositing Organization:

The Ministry of International Trade and Industry, National Institute of Bioscience and
Human Technology, Agency of Industrial Science and Technology
Address: 1-3, Higashi-1-Chome, Tsukuba-shi, Ibaraki-ken, Japan

Date of Deposition: June 21, 1995 (originally deposited on June 21, 1995 and transferred to the international
deposition according to the Budapest Treaty on October 25, 1995)

Accession No. FERM BP-5267

TABLE OF SEQUENCES

Sequence number: 1 
Length of sequence: 1316 
Sequence Type: nucleic acid 
Strandedness : double
Topology: linear 

Molecular type: genomic DNA (human OCIF genomic DNA-1) 



Sequence:

CTtiGAGACAT ATAACHGAA CACnCGCCC TCATGGGGAA GCAGCTCTGC AGGGACTTTT 60
TCACCCATCT CTAAACAATT TCAGTGCCAA CCCGCCAACT GTAATCCATG AATGGCACCA 120
CACTTTACAA GTCATCAA6T CTAACTTCTA GACCACGCAA TTAATCCCCG ACACAGCCAA 180
CCCTACAGCA AAGTGCCAAA CTTCTCTCGA TAGCTTGAGG CTAGTCGAAA GACCTCGACC 240
AGCCTACTCC AGAACnCAG CGCCTAGGAA GCTCCGATAC CAATAGCCCT TTCATGATGG 300
TGCGGrrCGT GAAGGGAACA CTGGTCCGCA AGGTTATCCC T6CCCCAGGC AGTCCAATTT 360
TCACTCTGCA GATTCTCTCT GGCTCTAACT ACCCCAGATA ACAACGACTG AATGCAGAAT 420
AGCACGGCCT HAGGGCCAA TCAGACATTA CnAGAAAAA nCCTACTAC ATGGlllATG 480
TAAACTTCAA 6ATCAATGAT TCCGAACTCC CCGAAAACGG CTCACACAAT GCCATGCATA 540
AAGAGGCCCC CTCTAATTTG AGCTTTCACA ACCCGAACT6 AACGGCTCAG GCACCCGGGT 600
ACGCCGGAAA CTCACAGCTT TCCCCCACC6 AGACGACAAA GGTCTCGCAC ACACTCCAAC 660
TGCGTCCGGA TCnCGCTCC ATCGGACTCT CAGGCTGGAC GAGACACAAC CACACCACCT 720
CCCCAGCGTC TCCCCAGCCC TCCCACCGCT GGTCCCGCCT GCCAGCACCC TGCCCGCTCG 780
CGGCAACCCG CCGGGAAACC TCACAGCCCC GCGGACACAG CAGCCCCCTT GTTCCTCACC 840
CCGGTGCCTT TTmrcccc TGCTCTCCCA GCGGACACAC ACCACCGCCC CACCCCTCAC 900
GCCCCACCTC CCTGGCGGAT CCTTTCCCCC CCACCCCTCA AAGCGHAAT CCTGGAGCrr 960
TCTCCACACC CCCCCACCGC TCCCGCCCAA GCITCCTAAA AAAGAAAGGT CCAAAGTrrG 1020
CTCCACCATA CAAAAATCAC TGATCAAAGG CACGCCATAC TTCCTCnCC CGCGACCCTA 1080
TATATAACCT CATCACCGCA CCCGCTGCGG AGACCCACCG CACCCCTCCC CCACCCGCCC 1140
CCTCCAAGCC CCTGAGGTTT CCGGCGACCA CA ATC AAC AAC TTC CTC TCC TCC 1193

Met Asn Lys Leu Leu Cys Cys
-20 -15 

GCG CTC GTG GTAACTCCCT GGGCCA6CCC ACGCGTGCCC GGCGCCTCGG 1242 
Ala Leu Val 

GACCCTGCT6 CCACCTCCTC TCCCAACCTC CCAGCGCACC GGCGGGGACA ACCCTCCACT 1302 
CGCTCCCTCC CAGG 1316 

Sequence number: 2 
Length of sequence: 9898 
Sequence Type: nucleic acid 
Strandedness : double 
Topology: linear 

Molecular type: genomic DNA (human OCIF genomic DNA-2) 
Sequence : 

GCnACTTTG TGCCAAATCT CAmGGCTT AAGCTAATAC AGGACTTTCA CTCAAATGAT 60 
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAACAT GATGCCACTG TGTTCCTTTC 120 
TCCTTCTAC TTT CTC CAC ATC TCC ATT AAC TGC ACC ACC CAC GAA ACC TTT 171 
Phe Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe
-10 -5 1 

CCT CCA AAG TAC CTT CAT TAT GAC GAA GAA ACC TCT CAT CAG CTG TTG 219 
Pro Pro Lys Tyr Leu His Tyr Asp Glu Glu Thr Ser His Gln Leu Leu
5 10 .15 
TGT GAC AM TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TCT ACA CCA 267
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35 

AAG TCG AAC ACC GTG TGC CCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 315 
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp 
40 45 50 

AGC TGC CAC ACC AGT CAC GAG TCT CTA TAC TGC ACC CCC GTG TGC AAG 363 
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65 

GAG CTG CAC TAC GTC AAG CAG GAG TGC AAT CCC ACC CAC AAC CGC GIG 411 
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80 

TGC GAA TGC AAG CAA GGC CCC TAC CH GAG ATA GAG HC TGC HC AAA 459 
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85 90 95 

CAT ACG ACC TGC CCT CCT CCA TTT CCA CTG GTG CAA CCT C CTACGTGTCA 509 
His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala
100 105 110 

ATGTGCAGCA AAAHAATTA GGATCATCCA AAGTCAGATA GTTCTGACAG TTTAGGACAA 569 
CACTTTTCn CTCATCACAT TATACCATAC CAAATTCCAA AGGTAATCAA ACCTCCCACC 629
TAGGTACTAT CTCTCTGCAG TCCTTCCAAA GGACCATTGC TCAGACGAAT ACTTTCCCAC 689 
TACAGCCCAA niAATGACA AATCTCAAAT GCACCAAATT ATTCTCTCAT CAGATCCATC 749 
ATCGTITTTT TTTTTTTTTT TAAAGAAACA AACTCAACTT CCACTAnCA TAGTTGATCT 809 
ATACCTCTAT ATTTCACTTC ACCATCCACA CCTTCAAACT GCAGCACTH HGACAAACA 869 
TCAGAAATCT TAATTTATAC CAAGACACTA AmTCCTCA TATTAATCAG ACTCTCGAGT 929
GCTAACAATA AGCACTTATA ATTAATTATG TAAAAAAT6A GAATGGTCAG GCGAAHCCA 989 
TTTCATTATT AAAAACAAGG CTAGnCTTC CmAGCATG CGAGCTGAGT CmGGCACG 1049 
GTAAGGACTA TAGCACAATC TCTTCAAIGA GCTTATTCTT TATCTTAGAC AAAACACATT 1109 
CrCAACCCAA GAGCAAGCAC TTCCCTATAA ACCAAGTGCT nCTCTTTTG CATTTTCAAC 1169 
AGCATTGGTC AGGGCTCATG TCTATTCAAT CTTTTAAACC AGTAACCCAC GTTmTTTC 1229 
TGCCACATTT GCGAACCTTC ACTCCAGCCT ATAACTTTTC ATAGCTTGAG AAAAHAAGA 1289 
GTATCCACn ACmGATGG AAGAACTAAT CAGTATAGAT TCTGATGACT CAGTrTCAAG 1349 
CAGTGTTTCT CAACTCAAGC CCTGCTGATA TTnAACAAA TATCTCGAn CCTAGGCTCG 1409 
ACtCCTTTTT CTGGGCA6CT CTCCTGCGCA TOTAGAATT TTCGCAGCAC CCCTGCACTC 1469 
TAGCCACTAG ATACCAATAG CAGTCCTTCC CaATGTGAC AGCCAAAAAT CTCTTCAGAC 1529 
ACTGTCAAAT CTCGCCACGT CGCAAAATCA CTCCTCGTTG AGAACAGGGT CATCAATCCT 1589 
AACTAICTGT AACTATTTTA ACTCTCAAAA CTTGTGATAT ACAAAGTCTA AAnATTAGA 1649 
CCACCAATAC nTAGGTTTA AAGGCATACA AATGAAACAT TCAAAAATCA AAATCTAm 1709 
TGmCTCAA ATACTGAATC TTATAAAAn AATCACACAA GATGCAAATT GCATCAGACT 1769 
CCCnAAAAT TCCTCTTCCT ATGACTATn GAGCGAGGAA TTGGTGATAC TTCCTACTTT 1829 
CTATTGGATG 6TACTTTCAG ACTCAAAAGC TAAGCTAAGT TGTGTGTCTC TCAGGGTGCG 1889 
GGGTGTGGAA TCCCATCACA TAAAAGCAAA TCCATGTAAT TCAHCAGTA ACTIGTATAT 1949 
GTAGAAAAAT GAAAAGTGGC CTATCCAGCT TGCAAACTAG AGAATTfTGA AAAATAATGC 2009 
AAATCACAAC GATCTTTCTT AAATAAGTAA GAAAATCTGT TTGTAGAATG AAGCAACCAG 2069 
GCA6CCAGAA GACTCAGAAC AAAACTACAC ATTTTACTCT GTGTACACTG GCACCACAGT 2129 
CGCATTTAn TACCTCTCCC TCCCTAAAAA CCCACACAGC GCnCCTCTT CCGAAArAAC 2189
AGCrrrCCAG CCCAAAGACA ACCAAAGACT ATGTCGTGTT ACTCTAAAAA CTATTTAATA 2249 
ACCCTmCT TCTTCCTGTT CCTGTnTGA AATCAGAHG TCTCCTCTCC ATATTTTATT 2309 
TACnCATTC TCTTAAnCC TGTCGAATTA CTTAGAGCAA GCATGGTCAA nCTCAACTC 2369 
TAAAGCCAAA TnCTCCATC ATTATAATTT CACATTTTGC CTG6CACGTT ATAATTTTTA 2429 
TAmCCACT CATA{?rAATA ACCTAAAATC ATTACnACA TCGATACATC mTTCATAA 2489 
AAACTACCAT CAGTTATAGA CGGAAGTCAT GTTCATGTTC AGGAAGGTCA TTAGATAAAC 2549 
CnCTGAATA TATTATCAAA CATTACTTCT GTCATTCTTA GATTCITTTT CnAAATAAC 2609 
TTTAAAAGCT AACmCCTA AAAGAAATAT CTCACACATA TCAACTTCTC AnACGATGC 2669 
ACCACAACAC CCAACCCACA CATATGTATC TCAACAATCA ACAACATTCT TAGGCCCGGC 2729 
ACGGTGGCTC ACATCTCTAA TCTCAAGACT TTGACAGGTC AACCCCGCCA GATCACCTCA 2789 
GCTCAGGAGT TCAAGACCAC CCTGCCCAAC ATCATGAAAC CCTGCCTCTA CTAAAAATAC 2849 
AAAAAmGC AGGGCATCCT GCTCCATGCC TCCAACCCTA GCTACTCAGG ACGCTCAGAC 2909 
ACGAGAATCT CTTCAACCCT CGACCCC6AG GnGTGCTGA CCTGACATCC CTCTACTCCA 2969 
CTCCAGCCTG CCTGACACAG ATGAGACTCC CrTCCCTGCCC CCGCCCCCCC CTTCCCCCCC 3029 
AAAAAGAnC TTCTTCATGC ACAACATACC GCAGTCAACA AAGGGAGACC TGCCTCCAGG 3089 
TCTCCAACTC ACTTATnCG AGTAAATTAC CAATGAAAGA ATGCCATGCA ATCCCTGCCC 3149 
AAATACCTCT CCTTATGATA TTCTACAATT TGATATAGAC TTGTATCCCA TTTAACGAGT 3209 
ACGATGTAGT AGCAAACTAC TAAAAACAAA CACACAAACA CAAAACCCTC TTTCCmCT 3269 
AAGCTGCnC CTAAGATAAT GTCAGTCCAA TCCTCCAAAT AATAITTAAT AT6TGAACGT 3329 
nTAGGCTGT GTriTCCCCT CCTGTTCTTT TTTTCTCCCA CCCCTTTGTC ATTTTTGCAC 3389 
GTCAATGAAT CATGTAGAAA GAGACAGGAG ATGAAACTAG AACCACTCCA TTTTCCCCCT 3449 
nmTAm TCTGCrmC CTAAAACATA CAATGAGCTA GGAGCTTGAC AmATAAAT 3509 
GAAGTTTAAT AACmCTGT AGCTTTGATT TTTCTCTTTC ATATTTGTTA TCTTGCATAA 3569
GCCACAATTG CCCTCTAAAA TCTACATATC CATATTCAAC TCTAAATCTC TTCAACTACC 3629 
nACACTAGA TCGACATAH TTCATATTCA GATACACTGG AATCTATGAT CTAGCCAT6C 3689 
CTAATATAGT CAACTCTTTG AAGCTATTTA TTTTTAATAG CCTCmACT TCTGCACTCC 3749

TTCAAcrrrr tctgccaatg atttcttcaa atttatcaaa TAmrrccA tcatcaacta 3809 

AAATCCCCn CCAGTCACCC TTCCTCAACT TTCAACGACT CTGCTCTTn AAACACTTTA 3869 
ACCAAATGGT ATATCATCTT CCGTTTACTA TCTACCTTAA CTGCACGCTT ACGCTTTTCA 3929 
GTCACCCCCC AACTTTATTC CCACCTTCAA AACTTTATTA TAATCTTCTA AATTTTTACT 3989 
TCTCAACGTT ACCATACnA CCACTTCCTT CACAATTAGC ATTCACGAAA GAAACAACTT 4049 
CACTAGCAAC TCATTCGAAT TTAATGATCC AGCATTCAAT GCCTACTAAT nCAAACAAT 4109 
GATAnACAG CACACACACA GCAGmTCT TCATTTTCTA GGAATAATTG TATGAAGAAT 4169 
ATCGCTCACA ACACCCCCTT ACTCCCACTC AGCGGACGCT GCACTAATt^ ACACCCTACC 4229 
CTTCTTTCCT TTCCTCTCAC AmCATGAC CCTTTTCTAC CTAACGAGAA AAnCACTTC 4289 
CATTTGCAn ACAAGGAGCA GAAACTCGCA AACGGGATCA TG6TCCAACT TTTGTTCTGT 4349 
CTAATGAAGT CAAAAATGAA AATCCTAGAG TTTTGTGCAA CATAATACTA GCACTAAAAA 4409 
CCAAGTGAAA ACTCinCCA AAACTGTGTT AACACCGCAT CTGCTGGCAA ACGATTTCAC 4469 
CAGAAGCTAC TAAATTCCTT GCTATTTTCC GTAC GA ACC CCA GAG CGA AAT ACA 4523
Gly Thr Pro Glu Arg Asn Thr
115

GTT TGC AAA AGA TGT CCA GAT GGG TTC TTC TCA AAT GAG ACG TCA TCT 4571
Val Cys Lys Arg Cys Pro Asp Gly Phe Phe Ser Asn Glu Thr Ser Ser
120 125 130 135

AAA GCA CCC TGT AGA AAA CAC ACA AAT TGC AGT GTC TTT GGT CTC CTG 4619
Lys Ala Pro Cys Arg Lys His Thr Asn Cys Ser Val Phe Gly Leu Leu
140 145 150

CTA ACT CAG AAA GGA AAT GCA ACA CAC GAC AAC ATA TGT TCC GGA AAC 4667
Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn
155 160 165

AGT GAA TCA ACT CAA AAA TGT GGA ATA G GTAATTACAT TCCAAAATAC 4715
Ser Glu Ser Thr Gln Lys Cys Gly Ile
170 175

GTCrnCTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCACCC 4775 
ACATTCTTCC TCAAACTTAC ATTTTCCCn TCTTCAATCT TAACCACCTA ACCCTACTCT 4835 
CCATGCAm CTGCTAAAGC TACCACTCAC AATCTCTCAA AAACTCATCT TCTCACACAT 4895 
AACACCTCAA AGCTTCATTT TCTCTCCTTT CACACTCAAA TCAAATCTTC CCCATACCCA 4955 
AAGGGCA6T6 TCAACTHGC CACTGACATG AAAHAGGAG AGTCCAAACT 6TAGAATTCA 5015 
CCTTCTCTGT TAnACTTTC ACGAATGTCT GtAnAHAA CTAAAGTATA TATTGGCAAC 5075 
TAAGAAGCAA AGTGATATAA ACATCATCAC AAAnAGGCC ACGCATGGTG GOTACTCCT 5135 
ATAATCCCAA CATTTTGGGG CGCCAAGCTA GCCAGATCAC TTGAGCTCAC GATTTCAACA 5195 
CCAGCCTGAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TACCTGGGCA 5255 
TGGTAGCAGG CACTTCTAGT ACCAGCTACT CAGCGCTGAG GCAGGACAAT CGCTTGAACC 5315 
CACCAGATGG ACGTTCCACT GACCTCAGAT TCTACCACTG CACTCCACTC TGCGCAACAC 5375 
AGCAAGArrr CATCACACAC ACAQCACAC ACACACACAC ACACAHAGA AATGTGTACT 5435 
TCGCTTTCn ACCTATCCTA HAGTCCATC TATTGCATGC AACTTCCAAC CTACTCTGCT 5495 
TGICHAAGC TCTTCATTGC GTACACGTCA CTACTAHAA GnCACCm TTCCGATCCA 5555 
TTCCACGCTA CTCATGACAA HCATCAGGC TACTCTGTCT GTTCACCTTG TCACTCCCAC 5615 
CACTAGACIA ATCTCACACC nCACTCAAA GACACATTAC ACTAAAGATG ATTTCCTTTT 5675 
TTCTGTrrAA TCAAGCAATC GTATAAACCA GCnCACTCT CCCCAAACAG TTTTTCGTAC 5735 
TACAAACAAG TTTATGAACC ACAGAAATCT CAATTGATAT ATATATGACA nCTAACCCA 5795 
CTICCAGCAT TGTTTCATTC TCTAATTGAA ATCATACACA ACCCATTHA CCCTTTCCTT 5855
TCTTATCTAA AAAAAAAAAA AAAAAAATGA AGGAAGGGGT ATTAAAAGGA GTGATCAAAT 5915
TTTAACATTC TCTTTAATTA ATTCATTTTT AATTTTACTT TTTTTCATTT ATTGTGCACT 5975
TACTATCTCC TACTCTGCTA TACACCCTTT AACATTTATA AAAACACTCT CAAACTTGCT 6035 
TCAGATGAAT ATACGTAGTA CAACGGCAGA ACTAGTAnC AAACCCAGCT CTGATCAATC 6095 
CAAAAACAAA CACCCAHAC TCCCATTnC TGGGACATAC HACTCTACC CAGATCCTCT 6155 
CGCCTTOTA ATGCCTATGT AAATAACATA CTTTTATGn TCGTTAnTT CCTATCTAAT 6215 
CTCTACmT ATATCTCTAT CTATCTCTTC CTTTCnTCC AAACCTAAAC TATCTCTCTA 6275 
AATCTCGCCA AAAAATAACA CACTATTCCA AATTACTCTT CAAAnCCTT TAAGTCACTG 6335 
ATAATTATTT GTTTTGACAT TAATCATGAA GTTCCCTGTG GGTACTAGGT AAACCTTTAA 6395
TACAATCnA ATCTTTCTAT TCATTATAAC AAmTTCGC TCTTACnAT HACAACAAT 6455 
ATTTCACTCT AAmGACAT TTACTAAACT nCTCTTCAA AACAAT6CCC AAAAAAGAAC 6515 
AHAGAAGAC ACGTAAGCTC AGTTCGTCTC TGCCACTAAC ACCAGCCAAC AGAACCTTGA 6575 
TTTTATTCAA ACTTTCCATT TTACCATATT TTATCTTCGA AAATTCAATT 6T6TTCGTTT 6635 
TtTCTTTTTG TTTGTATTCA ATACACTCTC ACAAATCCAA TTGnGACTA AATCTTCTGG 6695 
GTTTTCTAAC CTTTCTTTAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747
Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg
180 185

TTT GCT GTT CCT ACA AAG TTT ACG CCT AAC TGG CTT AGT GTC TTG GTA 6795
Phe Ala Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val
190 195 200

GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215

AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235

TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln
240 245 250

CTATGATAAT CTAAAATAAA AACATCAATC AGAAATCAAA CACACCTATT TATCATAAAC 7000 
UGGAACAAG ACTCCATGTA TCTTTAGTTC TGTCCATCn" GTTTCCCTCT TCGAATCATT 7060 
GTTCGACTGA AAAAGTTTCC ACCTGATAAT CTAGATCTCA TTCCACAAAC ACHATACAA 7120 
GGTTTTGTTC TCACCCCTGC TCCCCAGTTT CCTTGTAAAG TATGHGAAC ACTCTAAGAG 7180 
AACAGAAATG CATTTGAACC CACGCCTGTA TCTCACGGAC TCGCTTCCAG ATCCOTAAC 7240 
GCnCTCTAA GCACCCCCTC TAGACCACCA AGGA6AACCT CTATAACCAC TTTGTATCTT 7300 
ACATTGCACC TCTACCAA6A ACCTCTGTTG TATTTACTTG GTAATTCTCT CCAGGTAGGC 7360 
TTTTCCTAGC HACAAATAT GTTCnATTA ATCCTCATGA TATGGCCTGC ATTAAAAnA 7420 
TTTTAATCGC ATATGHATG AGAATTAATG ACATAAAATC TGAAAACTCT TTCAGCCTCT 7480 
TCTAGGAAAA ACaAGTTAC AGCAAAATGT TCTCACATCT TATAACTTTA TATAAAGATT 7540 
CTCCTnAGA AATGGTGTGA GAGAGAAACA CAGACAGATA GGGAGAGAAG TGTGAAAGAA 7600 
TCTGAAGAAA AGGAGTTTCA TCCAGKm% ACTGTAAGCT mCCACACA TGATGGAAAG 7660 
ACTTCTGACT TCACTAAGCA TTGCGACGAC ATCCTA6AAG AAAAAGGAAC AACACmCC 7720 
ATAATGCAGA CAGGGTCAGT GAGAAATTCA TTCAGCTCCT CACCAGTACT TAAATGACT6 7780 
TATAGTCTTG CACTACCCTA AAAAACnCA AGTATCTCAA ACCGGGGCAA CAGATTTTAG 7840 
GAGACCAACC TCTTTGAGAC CTCATTGCTT TTGCTTATCC AAACAGTAAA CTTTTATCn 7900 
TTGAGCAAAC CAAAAGTAn CTTTGAACGT ATAATTAGCC CTGAAGCCGA AAGAAAAGAG 7960 
AAAATCAGAG ACCGHACAA TTG6AAGCAA CCAAAnCCC TATTTTATAA ATGAGGACAT 8020
TTTAACCCAG AAAGATCAAC CCATTTGGCT TACCCCTCAC ACATACTAAC TGACTCATCT 8080
CATTAATACA AATGTTACTT CCTCCCTCTT AGGTTTGTAC CCTAGCTTAT TACTGAAATA 8140 
nCTCTACGC TCTCTCTCTC CTTTACnCC TCCACCTCAT GTCTnGACT TTTCACATAT 8200 
CCTCCTCATC GAGGTAGTCC TCTGGTCCTA TGTCTATTCT TTAAAGGCTA CTTACGGCAA 8260 
TTAACTTATC AACTAGCGCC TACTAATCAA ACTTTGTATT ACAAACTAGC TAACTTCAAT 8320 
ACTTTCCTTT TTTTCTCAAA TGnATCCTC GTAATTTCTC AAACTTTTTC TTACAAAACT 8380 
CAGACTCATC TGTCnATTT TCTACTCnA ATinCAAAA nACGACCTT CTTCCAAACT 8440 
mCTTCCAT CCCAAAAATA TATACCATAT TATCTTAm TAACAAAAAA TATTTATCTC 8500 
AGTrCTTAGA AATAAATCGT CTCACTTAAC TCCCTCICAA AACAAAAGCT TATCATTGAA 8560 
ATATAAmr CAAATTCTCC AAGAACCTTT TCCCTCACCC TTGTnTATC ATCCCATTGG 8620 
ATGAATATAA ATGATGTGAA CACTTATCTG GGCTTTTGCT TTATGCAG AT ATT GAC 8676
Asp Ile Asp

CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr
255 260 265 270

TTC GAG CAG CTT CGT AGC TTG ATG GAA AGC TTA CCG GGA AAG AAA GTG 8772
Phe Glu Gln Leu Arg Ser Leu Met Glu Ser Leu Pro Gly Lys Lys Val
275 280 285

GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820
Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp
290 295 300

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868
Gln Ile Leu Lys Leu Leu Ser Leu Trp Arg Ile Lys Asn Gly Asp Gln
305 310 315

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916
Asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr
320 325 330

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964
His Phe Pro Lys Thr Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe
335 340 345 350

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012
Leu His Ser Phe Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu
355 360 365

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054
Met Ile Gly Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370 375 380

TAACTGGAAA TCCCCATTGA GCTGTnCCT CACAAtTGCC CAGATCCCAT CGATGACTAA 9114 
ACTCTnCTC AGGCACTTCA GCCXnCAGT GATATCTTTC TCAmCCAC TCACTAATH 9174 
TCCCACAGGG TACTAAAAGA AACTATGAT6 TGCAGAAACC ACTAACATa CCTCCAATAA 9234 
ACCCCAAATC GTTAATCCAA CTCTCAGATC TGCATCCTTA TCTACTGACT ATATTTTCCC 9294 
TFATOCTGC TTCCAGTAAT TCAACTGCAA AHAAAAAAA AAAAACTACA CTCCACTCCC 9354 
CCTTACTAAA TATCCCAATG TCTAACnAA ATACCTTTGG GATTCCACCT ATGCTAGAGG 9414 
CTTTTATTAC AAACCCATAT TTTmCTCT AAAAGHACT AATATATCTG TAACACTAH 9474
ACAGTATTGC TATTTATATT CATTCACAtA TAAfiATTTGC ACATAnATC ATCCTATAAA 9534
CAAACGCTAT CACTTAATTT TAGAAACAAA ATTATAnCT CTTTAnATG ACAAATCAAA 9594 
CAGAAAATAT ATATTTTTAA TGCAAACTTT CTACCATTTT TCTAATACCT ACTCCCATAT 9654 
rmCTGTCT GCAGTATTTT TATAATTTTA TCTCTATAAG CTGTAATATC ATTTTATACA 9714 
AAATGCATTA TTTAGTCAAT TGTTTAATGT TGGAAAACAT ATGAAATATA AATTATCTGA 9774
ATATTACATC CTCTCAGAAA TTGAATCTAC CTTATTTAAA AGATTTTAlt? CTTTTATAAC 9834 
TATATAAATC ACATTATTAA ACTTTTCAAA TTATTTmA TTCCTtTCTC TGTTCCTTTT 9894 
ATTT 9898 

Sequence number: 3 
Length of sequence: 401 
Sequence Type: amino acid 
Strandedness : single stranded 
Topology: linear 
Molecular type: protein 

Sequence: 

Met Asn Asn Leu Leu Cys Cys Ala Leu Val Phe Leu Asp Ile Ser
    -20                 -15                 -10

Ile Lys Trp Thr Thr Gln Glu Thr Phe Pro Pro Lys Tyr Leu His
    -5                  1               5

Tyr Asp Glu Glu Thr Ser His Gln Leu Leu Cys Asp Lys Cys Pro
10                  15                  20

Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala Lys Trp Lys Thr
25                  30                  35

Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp Ser Trp His
40                  45                  50

Ile Gln Asp Ile Asp Leu Cys Glu Asn Ser Val Gln Arg His Ile
250                 255                 260

Gly His Ala Asn Leu Thr Phe Glu Gln Leu Arg Ser Leu Met Glu
265                 270                 275

Ser Leu Pro Gly Lys Lys Val Gly Ala Glu Asp Ile Glu Lys Thr
280                 285                 290

Ile Lys Ala Cys Lys Pro Ser Asp Gln Ile Leu Lys Leu Leu Ser
295                 300                 305

Leu Trp Arg Ile Lys Asn Gly Asp Gln Asp Thr Leu Lys Gly Leu
310                 315                 320

Met His Ala Leu Lys His Ser Lys Thr Tyr His Phe Pro Lys Thr
325                 330                 335

Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe Leu His Ser Phe
340                 345                 350

Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu Met Ile Gly
355                 360                 365

Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370                 375                 380

Sequence number: 4
Length of sequence: 1206 
Sequence Type: nucleic acid 
Strandedness: single stranded 
Topology: linear 
Molecular type: cDNA
Sequence:

ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC 60
CAGGAAACGT TTCCTCCAAA GTACCTTCAT TATGACGAAG AAACCTCTCA TCAGCTGTTG 120
TGTGACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT GTACAGCAAA GTGGAAGACC 180
GTGTGCGCCC CTTGCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACGAGTGT 240
CTATACTGCA GCCCCGTGTG CAAGGAGCTG CAGTACGTCA AGCAGGAGTG CAATCGCACC 300
CACAACCGCG TGTGCGAATG CAAGGAAGGG CGCTACCTTG AGATAGAGTT CTGCTTGAAA 360
CATAGGAGCT GCCCTCCTGG ATTTGGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420
GTTTGCAAAA GATGTCCAGA TGGGTTCTTC TCAAATGAGA CGTCATCTAA AGCACCCTGT 480
AGAAAACACA CAAATTGCAG TGTCTTTGGT CTCCTGCTAA CTCAGAAAGG AAATGCAACA 540
CACGACAACA TATGTTCCGG AAACAGTGAA TCAACTCAAA AATGTGGAAT AGATGTTACC 600
CTGTGTGAGG AGGCATTCTT CAGGTTTGCT GTTCCTACAA AGTTTACGCC TAACTGGCTT 660
AGTGTCTTGG TAGACAATTT GCCTGGCACC AAAGTAAACG CAGAGAGTGT AGAGAGGATA 720
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780
AACAAAGACC AAGATATAGT CAAGAAGATC ATCCAAGATA TTGACCTCTG TGAAAACAGC 840
GTGCAGCGGC ACATTGGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATGGAA 900
AGCTTACCGG GAAAGAAAGT GGGAGCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960
CCCAGTGACC AGATCCTGAA GCTGCTCAGT TTGTGGCGAA TAAAAAATGG CGACCAAGAC 1020
ACCTTGAAGG GCCTAATGCA CGCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080
GTCACTCAGA GTCTAAAGAA GACCATCAGG TTCCTTCACA GCTTCACAAT GTACAAATTG 1140
TATCAGAAGT TATTTTTAGA AATGATAGGT AACCAGGTCC AATCAGTAAA AATAAGCTGC 1200
TTATAA 1206

SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: SNOW BRAND MILK PRODUCTS CO., LTD 

(B) STREET: 1-1, NAEBOCHO 6-CHOME

(C) CITY: HIGASHI-KU, SAPPORO-SHI

(D) STATE: HOKKAIDO

(E) COUNTRY: JP

(F) POSTAL CODE (ZIP): NONE

(ii) TITLE OF INVENTION: [illegible] PROCESS FOR PREPARING PROTEIN
(iii) NUMBER OF SEQUENCES: 4

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA:

APPLICATION NUMBER: EP 97935810.8

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: JP 235928/96
(B) FILING DATE: 19-AUG-1996

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1316 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA (human OCIF genomic DNA-1)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATAACTTGAA CACTTGGCOC TGATGGGGAA GCAGCTCTGC AGGGACTTTT 60 

l^^^l^ GTAAACAATT TCAGTGGCAA CCCGCGAACT GTAATCCATG AATGGGACCA 120 

GTCATCAAGT CTAACTTCTA GACCAGGGAA TTAATGGGGG AGACAGCGAA 180 

COTn^GAGCA AAGTGCCAAA CTTCTGTCGA TAGCTTGAGO CTAGTGGAAA GACCTCGAGG 240 

^^^f^ AGAAGTTCAG CGC6TAGGAA GCTCOSATAC CAATAGCCCT TT6ATGATGO 300 

TGGGGTTGOT GAAGGGAACA GTGCTCCGCA AGGTTATCCC TGCCCCA6GC AGTCCAATTT 360 

IS^^^^ GATTCTCTCT GGCTCTAACT ACCCCAGATA ACAAGGAOTG AATGCAGAAT 420 ' 

J^^^Sf?^ TTAGGGCCAA TCAGACATTA GTTAGAAAAA TTCCTACTAC ATGGTTTATG 480 

AAGAGGGOCC CTGTAATTTG AOT ^ 

^S™^™^^^''*^ 660 
TGC6TCC6GA TCTTGGCTG6 ATCGGACTCT CAGGGTGGAG GAGACACAAG CACAGCAGCT 720 
G^GCGTG TGCCCAGCCC TCCCACCGCT GGIX^^ 780 

840 

^^^S^ 900 
G^CCACCTC CCTGGGGGAT CCI^ 960 

TCTGCACAOC CCCOGACCGC TCCCGCCCAA GCTTCCTAAA AAAGAAAGGT GCAAAGTTTG 1020 

GTCCAGGATA GAAAAATGAC T6ATCAAAGG CAGGCGATAC TTCCTGTTGC CGGGACGCTA 1080 

TATATAACGT GATGAGCGCA CGGGCTGCGG AGACGCACCG GAGCGCTCGC CCAGCCGCCG 1140 

CCTCCAAGOC CCTGAGGTTT CCGGGGACCA CA ATG AAC AAG TTG CTG TGC TGC 1193 

Met Asn Lys Leu Leu Cys Cys

-20 -15

GCG CTC GTG GTAAGTCCCT GGGCCAOCCG ACGGGTGCCC GGCGCCTGGG 1242
Ala Leu Val

GAGGCTGCTG CCACCTGGTC TCCCAACCTC CXAGCGGACC GGCG6GGAGA AGGCTCCACT 1302 
CGCTCCXrrCC CAQG 1316 



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9898 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA (human OCIF genomic DNA-2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 60
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 120
TCCTTCTAG TTT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 171
Phe Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe
-10 -5 1

CCT CCA AAG TAC CTT CAT TAT GAC GAA GAA ACC TCT CAT CAG CTG TTG 219
Pro Pro Lys Tyr Leu His Tyr Asp Glu Glu Thr Ser His Gln Leu Leu
5 10 15

TGT GAC AAA TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TGT ACA GCA 267
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35

AAG TGG AAG ACC GTG TGC GCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 315
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp
40 45 50

AGC TGG CAC ACC AGT GAC GAG TGT CTA TAC TGC AGC CCC GTG TGC AAG 363
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65

GAG CTG CAG TAC GTC AAG CAG GAG TGC AAT CGC ACC CAC AAC CGC GTG 411
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80

TGC GAA TGC AAG GAA GGG CGC TAC CTT GAG ATA GAG TTC TGC TTG AAA 459
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85 90 95

CAT AGG AGC TGC CCT CCT GGA TTT GGA GTG GTG CAA GCT G GTACGTGTCA 509
His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala
100 105 110

ATGTGCAGCA AAATTAATTA GGATCATGCA AAGTCAGATA GTTGTGACAO TTTAGGAGAA 569 
CACTTTTGTT CTGATGACAT TATAGGATAG CAAATTQCAA AGGTAATGAA ACCTGCCAGG 629 
TAGGTACTPAT GTGTCTC^GAG TGCTTCCAAA (;GACCATTGC TCAGAGGAAT ACTTTGCCAC 689 
TACAGGGCAA TTTAATGACA AATCTCAAAT GCAGCAAATT ATTCTCTCAT GAGATGCATG 749 
ATGGTTTTTT TTX T T T UTTT TAAAGAAACA AACTCAAGTT GCACTATTGA TAGTTGATCT 809 
ATACCTCTAT ATTTCACTTC AGCATOOACA CCTTCAAACT GCAGCACTTT TTGACAAACA 869 
TCAGAAATGT TAATTTATAC CAAGAGAGTA ATTATCCTCA TATTAATGAG ACTCTG6A0T 929 
GCTAACAATA AGCAGTTATA ATTAATTATG TAAAAAATGA GAATGOTGAG GGGAATTCK^l 989 
TTTCATTATT AAAAAC:AAGG CTAGTTCTTC CTTTAGCATG (3GAGCTGAGT (7rTTGGGA(3G 1049 
GTAAGGACTA TAGCAGAATC TCTTCAATGA (3CTTATTCTT TATCTTAGAC AAAACA<5ATT 1109 
<;TCAAGCCAA GAGCAAGCAC TTGCCTATAA ACCAAGTGCT TTCTCTTTTO CATTTTGAAC 1169 
AGCATT6GTC AGGGCTCATG TGTATTC5AAT CTTTTAAACC AGTAACCCAC GTTTTTTTTC 1229 
TGOCACATTT GCGAA(3CTTC A(3T<Xa<KCT ATAACTTTTC ATAGCTTGAG AAAATTAAGA 1289 
GTATCCACTT ACTTAGATGG AA6AAGTAAT CAGTATAGAT TCTGATGACT CAGTTTGAAO 1349

Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn
155 160 165

AGT GAA TCA ACT CAA AAA TGT GGA ATA G GTAATTACAT TCCAAAATAC 4715
Ser Glu Ser Thr Gln Lys Cys Gly Ile
170 175

GTCTTTGTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCAGCC 4775 
ACATTCTTGQ TCAAACTTAC ATTTTCCCTT TCTTGAATCT TAACCAGCTA AGGCTACTCT 4835 
CGATGCATTA CTGCTAAAGC TACCACTCAG AATCTCTCAA AAACTCATCT TCTCACAGAT 4895 
AACACCTCAA AGCTTGATTT TCTCTCCTTT CACACTGAAA TCAAATCTT6 CCCATAGQCA 4955 
AAGGGCAGTO TCAAGTTTGC CACTGAGATG AAATTAGGAG AGTCCAAACT 6TAGAATTCA 5015 
C6TTGTQTGT TATTACTTTC ACGAATGTCT GTATTATTAA CTAAAGTATA TATTGGCAAC 5075 
TAAGAAGCAA AGTGATATAA ACATGATGAC AAATTAGGCC AGGCATGQTG GCTTACTCCT 5135 
ATAATCCCAA CATTTTGGGG GGCCAA6GTA GGCAGATCAC TTGA6GTCAG GATTTCAAGA 5195 
CCAGCCTGAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TAGCTGGGCA 5255 
TGGTAGCAGG CACTTCTAGT AOCAGCTACT CAGGGCTGAG GCAGGAGAAT CGCTTGAACC 5315 
CAGGAGATGO AGCTTGCAGT GAGCTGAGAT TGTACCACTG CACTCCAGTC TGGGCAACAG 5375 
AGCAAGATTT CATCACACAC ACACACACAC ACACACACAC ACACATTAGA AATGTGTACT 5435 
TGGCTTT6TT ACCTATGGTA TTAGTGCATC TATTGCATGG AACTTCCAAG CTACTCTGGT 5495 
TGTGTTAA6C TCTTCATTGG GTACAGGTCA CTAGTATTAA GTTCAGGTTA TTC6GATGCA 5555 
TTCCACGGTA GTGATGACAA TTCATCXGOC TAGTGTGTGT GTTCACCTTG TCACTCCCAC 5615 
CACTAGACTA ATCTCAGACC TTCACTCAAA GACACATTAC ACTAAAGATG ATTTG CTTTT 5675 
TTGTGTTTAA TCAAGCAATG OTATAAACCA GCTTGACTCT CCCCAAACAG TTTTTCGTAC 5735 
TACAAAQAAG TTTATGAAGC AGAGAAATGT GAATTGATAT ATATATGAGA TTCTAACCCA 5795 
GTTCCAGCAT TGTTTCATTG TGTAATTGAA ATCATAGACA AQCCATTTTA GCCTTTGCTT 5855 
TCTTATCTAA AAAAAAAAAA AAAAAAATGA AGGAAGGGGT ATTAAAAGGA GTGATCAAAT 5915 
TTTAACATTC TCTTTAATTA ATTCATTTTT AATTTTACTT TTTTTCATTT ATTGTGCACT 5975 
TACTATGTGG TACTGTGCTA TAGAGGCTTT AACATTTATA AAAACACTGT GAAAGTTGCT 6035 
TCAGATGAAT ATAGGTAGTA GAACGGCAGA ACTAGTATTC AAAGCCAGGT CTGATGAATC 6095 
CAAAAACAAA CAOCCATTAC TCCCATTTTC TOOGACATAC TTACTCTACC CAGATGCTCT 6155 
GGGCTTTGTA ATGCCTATGT AAATAACATA GTTTTATGTT TGGTTATTTT CCTATGTAAT 6215 
6TCTACTTAT ATATCTGTAT CTATCTCTTG CTTTGTTTCC AAAGGTAAAC TATGTGTCTA 6275 
AATGTGGGCA AAAAATAACA CACTATTCCA AATTACTGTT CAAATTCCTT TAAGTCAOTG 6335 
ATAATTATTT GTTTTGACAT TAATCATGAA GTTCCCTGTG GGTACTAGGT AAACCTTTAA 6395 
TAGAATGTTA ATGTTTGTAT TCATTATAAG AATTTTTGGC TGTTACTTAT TTACAACAAT 6455 
ATTTCACTCT AATTA6ACAT TTACTAAACT TTCTCTTGAA AACAATGCCC AAAAAAGAAC 6515 
ATTAGAAGAC ACGTAAGCTC AGTTGGTCTC TGCCACTAAG ACCAGCCAAC AOAAGCTTGA 6575 
TTTTATTCAA ACTTTGCATT TTAGCATATT TTATCTTGGA AAATTCAATT GTGTTGGTTT 6635 
TTTGTTTTTG TTTGTATTGA ATAGACTCTC AGAAATCCAA TTGTTGAGTA AATCTTCTGG 6695 
GTTTTCTAAC CTTTCTTTAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747
Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg
180 185

TTT GCT GTT CCT ACA AAG TTT ACG CCT AAC TGG CTT AGT GTC TTG GTA 6795
Phe Ala Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val
190 195 200

GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215

AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235

TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln
240 245 250

GTATGATAAT CTAAAATAAA AAGATCAATC AGAAATCAAA GACACCTATT TATCATAAAC 7000 
CAGGAACAAG ACTGCATGTA TGTTTAGTTG TGTGGATCTT GTTTCCCTGT TGGAATCATT 7060 
GTTGGACTGA AAAAGTTTCC ACCTGATAAT GTAGATGTGA TTCCACAAAC AGTTATACAA 7120 
GGTTTT6TTC TCACCCCTGC TCCCCAGTTT CCTTOTAAAG TATGTTGAAC ACTCTAAGAG 7180 
AAGAGAAATG CATTTGAAGG CAGGGCTGTA TCTCAGGGAG TCGCTTCCAG ATCCCTTAAC 7240
GCTTCTGTAX GCX6CCCCTC TAGACXACCA AGGAGAAGCT CTATAACCAC TTTGTATCTT 7300
ACATTGCACC TCTACCAAGA AGCTCTGTTG TATTTACTTG GTAATTCTCT CCAGGTAGGC 7360 
TTTTC G T A GC TTACAAATAT GTTCTTATTA ATCCTCATGA TATGGCCTGC ATTAAAATTA 7420 
TTTTAATGGC ATATGTTATG A6AATTAATG AGATAAAATC TGAAAAGTGT TTGAGCCTCT 7480 
T6TAGGAAAA AGCTAGTTAC AGGAAAATGT TCTCACATCT TATAAGTTTA TATAAAGATT 7540 
CTCCTTTAGA AATGGTGTGA GAGAGAAACA GAGAGAGATA GGGAGAGAA6 TGTGAAAGAA 7600 
TCTGAAGXAA AGGAGTTTCA TCCAGTGTGG ACTGTAAGCT TTACGACACA TGATGGAAAG 7660 
AGTTCTGACT TCAGTAAGCA TTGGGAGGAC ATGCTAGAAG AAAAAGOAAO AAGAGTTTCC 7720 
ATAATGCAOA CAGGGTCAGT GAGAAATTCA TTCAG6TCCT CACCAGTAGT TAAAT6ACTG 7780 
TATAGTCTTG CACTACOCTA AAAAACTTCA AGTATCTGAA ACCGGGGCAA CAGATT TTAG 7840 
GAGACCAACG TCTTTGAGAG CTGATTGCTT TTGCTTATGC AAAGAGTAAA CTTTTATGTT 7900 
TTGAGCAAAC CAAAAGTATT CTTTGAACGT ATAATTAGOC CTGAAGCCGA AAGAAAAGAG 7960 
AAAATCAGAG ACC6TTAGAA TTGGAAOCAA CCAAATTCCC TATTTTATAA ATGAGGACAT 8020 
TTTAACCCAG AAAGATGAAC CGATTTGGCT TAGOGCTCAC AGATACTAAG TGACTCATGT 8080 
CATTAATAGA AAT6TTAGTT CCTCCCTCTT AGGTTTGTAC CCTAGCTTAT TACTGAAATA 8140 
TTCTCTAQGC TGTGTGTCTC CTTTAGTTCC TCGACCTCAT GTCTTTGAGT TTTCAGATAT 8200 
CCTCCTCATG 6AGGTAGTCC TCTGGTGCTA TGTGTATTCT TTAAAGGCTA GTTACGGCAA 8260 
TTAACTTATC MiCTAGCGCC TACTAATGAA ACTTPGTATT ACA AAGTAGC TAACTTGAAT 8320 
ACTTTCCTTT TTTTCTGAAA TGTTATGGTG GTAATTTCTC AAACTTTTTC TTAGAAAACT 8380 
GAGAGTGATG TOTCTTATTT TCTACTGTTA ATTTTCAAAA TTAGGAGCTT CTTCCAAAGT 8440 
TTTGTTG6AT GCCAAAAATA TATAGCATAT TATCTTATTA TAACAAAAAA TATTTATCTC 8500 
AGTTCTTAGA AATAAATGGT GTCACTTAAC TCCCTCTCAA AAGAAAAGGT TATCATTGAA 8560 
ATATAATTAT GAAATTCTGC AAGAACCTTT TGCCTCACGC TTGTTTTATG ATGGCATT60 8620 
ATGAATATAA ATGATGTGAA CACTTATCTG GGCTTTTGCT TTATGCAG AT ATT GAC 8676
Asp Ile Asp

CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr
255 260 265 270

TTC GAG CAG CTT CGT AGC TTG ATG GAA AGC TTA CCG GGA AAG AAA GTG 8772
Phe Glu Gln Leu Arg Ser Leu Met Glu Ser Leu Pro Gly Lys Lys Val
275 280 285

GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820
Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp
290 295 300

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868
Gln Ile Leu Lys Leu Leu Ser Leu Trp Arg Ile Lys Asn Gly Asp Gln
305 310 315

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916
Asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr
320 325 330

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964
His Phe Pro Lys Thr Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe
335 340 345 350

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012
Leu His Ser Phe Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu
355 360 365

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054
Met Ile Gly Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370 375 380

TAACTGGAAA TGGCCATTGA GCTGTTTCCT CACAATTGGC GAGATCCCAT GGATGAGTAA 9114 
ACTGTTTCTC A6GCACTTGA GGCTTTCAGT GATATCTTTC TCATTACCAG TGACTAATTT 9174 
T6CCACAGGG TACTAAAAGA AACTATGATQ TGGA6AAAGG ACTAACATCT CCTCCAATAA 9234 
ACCCCRAATG GTTAATCCAA CTGTCA6ATC TG6ATCGTTA TCTACTGACT ATATTTTCCC 9294 
TTATTACTGC TTGCAGTAAT TCAACTGGAA ATTAAAAAAA AAAAACTAGA CTCCACTGGG 9354 
CCTTACTAAA TATGGGAATG TCTAACTTAA ATAGCTTTGG GATTCCAGCT ATGCTAGAGG 9414 
CTTTTATTAG AAAGCCATAT TTTTTTCTGT AAAAGTTACT AATATATCTG TAACACTATT 9474
ACAOTATTGC TATTTATATT CATTCAGATA TAAGATTTGO ACAXATTATC ATCCTATAAA 9534
GAAAC6GTAT GACTTAATTT TAQAAAGAAA ATTATATTCT GTTTATTATG ACAAATGAAA 9594 
GAGAAAATAT ATATTTTTAA TOGAAA6TTT GTAGCATTTT TCTAATAGGT ACTGCCATAT 9654 
TTTTCTGTGT GGAGTATTTT TATAATTTTA TCTGTATAAG CTGTAATATC ATTTTATAGA 9714 
AAATGCATTA TTTAGTCAAT TGTTTAATGT TGGAAAACAT ATGAAATATA AATTATCTGA 9774 
ATATTAGATG CTCTGAQAAA TT6AATGTAC CTTATTTAAA AGATTTTATG GTTTTATAAC 9834 
TATATAAATG ACATTATTAA AGTTTTCAAA TTATTTTTTA TTGCTTTCTC TGTTGCTTTT 9894
ATTT 9898 



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 401 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



Met Asn Asn Leu Leu Cys Cys Ala Leu Val Phe Leu Asp Ile Ser
    -20                 -15                 -10

Ile Lys Trp Thr Thr Gln Glu Thr Phe Pro Pro Lys Tyr Leu His
    -5                  1               5

Tyr Asp Glu Glu Thr Ser His Gln Leu Leu Cys Asp Lys Cys Pro
10                  15                  20

Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala Lys Trp Lys Thr
25                  30                  35

Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp Ser Trp His
40                  45                  50

Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys Glu Leu
55                  60                  65

Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val Cys
70                  75                  80

Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85                  90                  95

His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala Gly Thr
100                 105                 110

Pro Glu Arg Asn Thr Val Cys Lys Arg Cys Pro Asp Gly Phe Phe
115                 120                 125

Ser Asn Glu Thr Ser Ser Lys Ala Pro Cys Arg Lys His Thr Asn
130                 135                 140

Cys Ser Val Phe Gly Leu Leu Leu Thr Gln Lys Gly Asn Ala Thr
145                 150                 155

His Asp Asn Ile Cys Ser Gly Asn Ser Glu Ser Thr Gln Lys Cys
160                 165                 170

Gly Ile Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg Phe Ala
175                 180                 185

Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val Asp
190                 195                 200

Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205                 210                 215

Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys
220                 225                 230

Leu Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile
235                 240                 245

Ile Gln Asp Ile Asp Leu Cys Glu Asn Ser Val Gln Arg His Ile
250                 255                 260

Gly His Ala Asn Leu Thr Phe Glu Gln Leu Arg Ser Leu Met Glu
265                 270                 275

Ser Leu Pro Gly Lys Lys Val Gly Ala Glu Asp Ile Glu Lys Thr
280                 285                 290

Ile Lys Ala Cys Lys Pro Ser Asp Gln Ile Leu Lys Leu Leu Ser
295                 300                 305

Leu Trp Arg Ile Lys Asn Gly Asp Gln Asp Thr Leu Lys Gly Leu
310                 315                 320

Met His Ala Leu Lys His Ser Lys Thr Tyr His Phe Pro Lys Thr
325                 330                 335

Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe Leu His Ser Phe
340                 345                 350

Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu Met Ile Gly
355                 360                 365

Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu
370                 375                 380









(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1206 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:



ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC 60
CAGGAAACGT TTCCTCCAAA GTACCTTCAT TATGACGAAG AAACCTCTCA TCAGCTGTTG 120
TGTGACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT GTACAGCAAA GTGGAAGACC 180
GTGTGCGCCC CTTGCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACGAGTGT 240
CTATACTGCA GCCCCGTGTG CAAGGAGCTG CAGTACGTCA AGCAGGAGTG CAATCGCACC 300
CACAACCGCG TGTGCGAATG CAAGGAAGGG CGCTACCTTG AGATAGAGTT CTGCTTGAAA 360
CATAGGAGCT GCCCTCCTGG ATTTGGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420
GTTTGCAAAA GATGTCCAGA TGGGTTCTTC TCAAATGAGA CGTCATCTAA AGCACCCTGT 480
AGAAAACACA CAAATTGCAG TGTCTTTGGT CTCCTGCTAA CTCAGAAAGG AAATGCAACA 540
CACGACAACA TATGTTCCGG AAACAGTGAA TCAACTCAAA AATGTGGAAT AGATGTTACC 600
CTGTGTGAGG AGGCATTCTT CAGGTTTGCT GTTCCTACAA AGTTTACGCC TAACTGGCTT 660
AGTGTCTTGG TAGACAATTT GCCTGGCACC AAAGTAAACG CAGAGAGTGT AGAGAGGATA 720
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780
AACAAAGACC AAGATATAGT CAAGAAGATC ATCCAAGATA TTGACCTCTG TGAAAACAGC 840
GTGCAGCGGC ACATTGGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATGGAA 900
AGCTTACCGG GAAAGAAAGT GGGAGCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960
CCCAGTGACC AGATCCTGAA GCTGCTCAGT TTGTGGCGAA TAAAAAATGG CGACCAAGAC 1020
ACCTTGAAGG GCCTAATGCA CGCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080
GTCACTCAGA GTCTAAAGAA GACCATCAGG TTCCTTCACA GCTTCACAAT GTACAAATTG 1140
TATCAGAAGT TATTTTTAGA AATGATAGGT AACCAGGTCC AATCAGTAAA AATAAGCTGC 1200
TTATAA 1206
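
As a consistency check on the listings, the coding segments of the genomic sequences can be summed and compared with the 1206-base cDNA of SEQ ID NO:4 and the 401-residue protein of SEQ ID NO:3. The sketch below is illustrative only. The segment coordinates are an assumption of this sketch: they were read off the running position numbers printed at the line ends of the SEQ ID NO:1 and SEQ ID NO:2 listings and are not stated as such in the specification. The names `CODING_SEGMENTS` and `segment_length` are hypothetical.

```python
# Coding segments as (listing, first base, last base), 1-based and inclusive.
# Assumption: coordinates were derived from the position numbers in the
# SEQ ID NO:1 and SEQ ID NO:2 listings; they are not given in the source.
CODING_SEGMENTS = [
    ("SEQ ID NO:1", 1173, 1202),  # coding part of exon 1, from the ATG start codon
    ("SEQ ID NO:2", 130, 499),    # exon 2
    ("SEQ ID NO:2", 4504, 4695),  # exon 3
    ("SEQ ID NO:2", 6716, 6940),  # exon 4
    ("SEQ ID NO:2", 8669, 9057),  # coding part of exon 5, through the TAA stop codon
]


def segment_length(first, last):
    """Length of an inclusive 1-based interval."""
    return last - first + 1


lengths = [segment_length(first, last) for _, first, last in CODING_SEGMENTS]
total_coding = sum(lengths)   # bases of coding sequence, stop codon included
codons = total_coding // 3    # number of codons, stop codon included
residues = codons - 1         # translated residues, stop codon excluded

print(lengths)
print(total_coding, residues)
```

Under these assumed coordinates the segment lengths add up to the length of SEQ ID NO:4, and the codon count less the stop codon matches the length of SEQ ID NO:3.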
Claims



1. A DNA comprising the nucleotide sequences of Sequences No. 1 and No. 2 in the Sequence Table.

2. The DNA according to claim 1, wherein the Sequence ID No. 1 includes the first exon of the OCIF gene and the Sequence ID No. 2 includes the second, third, fourth, and fifth exons.

3. A protein exhibiting the activity of inhibiting differentiation and/or maturation of osteoclasts and having the following physicochemical characteristics:

(a) molecular weight (SDS-PAGE):

(i) under reducing conditions: about 60 kD,

(ii) under non-reducing conditions: about 60 kD and about 120 kD;

(b) amino acid sequence:

includes the amino acid sequence of the Sequence ID No. 3 in the Sequence Table,

(c) affinity:

exhibits affinity to a cation exchanger and heparin, and

(d) heat stability:

(i) the osteoclastogenesis-inhibitory activity is reduced when treated with heat at 70°C for 10 minutes or at 56°C for 30 minutes,

(ii) the osteoclastogenesis-inhibitory activity is lost when treated with heat at 90°C for 10 minutes.

4. A process for producing a protein exhibiting an activity of inhibiting differentiation and/or maturation of osteoclasts and having the following physicochemical characteristics:

(a) molecular weight (SDS-PAGE):

(i) under reducing conditions: about 60 kD,

(ii) under non-reducing conditions: about 60 kD and about 120 kD;

(b) amino acid sequence:

includes the amino acid sequence of the Sequence ID No. 3 in the Sequence Table,

(c) affinity:

exhibits affinity to a cation exchanger and heparin, and

(d) heat stability:

(i) the osteoclastogenesis-inhibitory activity is reduced when treated with heat at 70°C for 10 minutes or at 56°C for 30 minutes,

(ii) the osteoclastogenesis-inhibitory activity is lost when treated with heat at 90°C for 10 minutes,

the process comprising inserting a DNA including the nucleotide sequences of Sequences No. 1 and No. 2 in the Sequence Table into an expression vector, producing a vector capable of expressing a protein having the above-mentioned physicochemical characteristics and exhibiting the activity of inhibiting differentiation and/or maturation of osteoclasts, and producing this protein by a genetic engineering technique.

Figure 1